The use of various cancer cell lines can recapitulate known
tumor-associated mutations and genetically define cancer 
subsets. This approach also enables comparative surveys of 
associations between cancer mutations and drug responses. 
Here, we analyzed the effects of ~40,000 compounds on
cancer cell lines that showed diverse mutation-dependent 
sensitivity profiles. Over 1,000 compounds exhibited unique 
sensitivity on cell lines with specific mutational genotypes, 
and these compounds were clustered into six different classes
of mutation-oriented sensitivity. The present analysis provides 
new insights into the relationship between somatic mutations 
and selectivity response of chemicals, and these results should 
have applications related to predicting and optimizing therapeutic
windows for anti-cancer agents. [BMB Reports 2013;
46(2): 97-102] 



INTRODUCTION 

Identifying the effects and mechanisms of known drugs provides
perspectives for developing new cancer therapies.
Approaches from systems biology and bioinformatics have
been widely applied to discover new drug candidates with spe- 
cific cellular activities and mechanisms, and these approaches 
have mainly focused on the lineage-based classification of can- 
cer cell lines (1). Somatic mutations are important contributors 
to cancer progression and drug responses (2). A genotype-ori- 
ented analysis of compound response should thus be carried 
out using a wide variety of cancer cell lines. Accordingly, we 
used a new statistical method, termed Cell Line Enrichment 
Analysis (CLEA), to quantitatively analyze associations between 
genotype and drug sensitivity in cancer cell lines (2). 
Furthermore, this approach enabled us to measure the correla- 
tion between differentially expressed genes and mutational 
genotypes. 

Anticancer compound screening of 60 cell lines by the
National Cancer Institute (USA) (NCI60) was initiated in the
late 1980s as a way to discover new drugs for leukemia. The
NCI60 panel represents nine distinct tumor types (3): leukemia,
colon, lung, CNS, renal, melanoma, ovarian, breast and prostate.
The response data quantified the GI50 values for more than 
40,000 chemical compounds and the results have been made 
available in a public database (DTP, http://dtp.cancer.gov/). The 
GI50 represents the compound concentration required to inhibit 
the growth of exposed cells to 50% of that of untreated control 
cells. The NCI60 panel provides many opportunities for identi- 
fying the pathways and mechanisms related to cancer at both 
the molecular and genetic levels (4,5). Specifically, the NCI60 
panel of human tumor cell lines has been characterized at the 
molecular level. The analysis of RNA expression (DNA micro- 
array data) provides unique transcriptional features for each cell 
line (6), and single nucleotide polymorphism data have pro- 
vided estimates of DNA copy number variation at ~ 120,000 
sites (7). Additional types of molecular characterization of these 
cell lines include microRNA expression (8), DNA mutations (9), 
protein analysis (10), DNA methylation (11), functional target 
analysis (12) and the reverse phase protein array (RPPA) analy- 
sis (13). These data have been used to discover valuable rela- 
tionships between compound structure, mechanism of action, 
cell lineage and tumor mutations, among others. 

CLEA is a valuable tool, particularly for the identification of
genotype-dependent compound sensitivity. Using this analytical
tool, we calculated the enrichment of each compound for each
genotypic category of the NCI60 cell lines and then attempted
to generate CLEA maps to select compounds with significant
sensitivity against one of the particular genotypes.
Through hierarchical clustering analysis of GI50 data against 
various mutational genotypes, we attempted to confirm the ex- 
istence of clusters of compounds that were restricted in terms 
of specific genotypes. The aim of this study was to systemati- 
cally identify all potential compounds exhibiting specific sensi- 
tivity to a tumor genotype, and these findings should have ap- 
plications for identifying potential compounds for geno- 
type-oriented cancer therapies. 

RESULTS AND DISCUSSION 

The frequency of mutational genotypes in NCI60 cell lines 

The NCI60 cell lines have been extensively characterized at 
the molecular level, and mutation information for the NCI60 
cell lines is publicly available through the DTP website. A total
of 30 different mutations are annotated for these 60 cell lines,
and we calculated the frequency of individual mutations across
the panel (Table 1). In summary, TP53 showed the highest
mutation frequency, i.e., 44 of 60 cell lines (73.33%) harbored
a TP53 mutation. CDKN2A also showed a relatively high mutation
frequency, as it was detected in 35 of 60 cell lines
(58.33%). However, genes such as EGFR, BRCA2 and NF2
showed a low frequency of mutation (5%) in the NCI60 cell
lines, and NOTCH1, HRAS, MSH6 and VHL only showed a
mutation frequency of 3.33%. Mutations in FBXW7, FLT3,
PDGFRA, MAP2K4, and BRCA1 and gene amplifications of
KRAS, AKT2, and EGFR were each observed in only one cell line
(1.67%). To ascertain statistical significance in the CLEA
analysis (see Methods section for greater detail), genes having
mutations or amplifications in more than 3 cell lines (TP53,
CDKN2A, PTEN, KRAS, RAF, PIK3CA, APC, c-MYC-Amp,
STK11, CTNNB1, SMAD4, RB1, MLH1, NRAS, and TN_stromal)
were selected for evaluating the association with compound
response.
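
The frequency calculation and the cutoff of more than 3 cell lines described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the dictionary copies only a subset of the counts in Table 1, and the function and variable names are hypothetical.

```python
# Illustrative sketch; counts are a subset copied from Table 1.
N_CELL_LINES = 60

mutation_counts = {
    "TP53": 44, "CDKN2A": 35, "PTEN": 17, "KRAS": 14,
    "NRAS": 4, "EGFR": 3, "NOTCH1": 2, "FBXW7": 1,
}

def frequency_percent(count, total=N_CELL_LINES):
    """Percentage of cell lines carrying the alteration (2 decimals)."""
    return round(100.0 * count / total, 2)

def select_genotypes(counts, min_exclusive=3):
    """Keep alterations found in more than `min_exclusive` cell lines."""
    return [gene for gene, n in counts.items() if n > min_exclusive]

frequencies = {g: frequency_percent(n) for g, n in mutation_counts.items()}
selected = select_genotypes(mutation_counts)
```

With these inputs, EGFR (3 cell lines) falls just below the cutoff, whereas NRAS (4 cell lines) is retained.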

Hierarchical clustering of genotype-specific compounds 

To identify patterns of genotype-specific compound responses,
we selected subsets of compounds using the GI50 profile pattern
on the CLEA map. First, a -logGI50 value of 5 (i.e.,
GI50 = 10 μM) was adopted as the bipartite cutoff to determine
whether a compound was sensitive (>5) or insensitive (<5) to
any of the cell lines in the NCI60 panel (14). Second, the
enrichment score (AUC value) in the CLEA analysis was used to
select genotype-specific compounds (see Methods section for
greater detail). An AUC value of 0.85 and a P value of 0.01
were used as cutoff values to ensure that compounds had a
significant sensitivity for a particular genotype. We only
included compounds that demonstrated strong potency (i.e.,
-logGI50 value of >5) against at least one cell line in the
NCI60 panel. As a result, a total of 1,161 non-redundant
compounds were compiled, satisfying the above-mentioned filters.
Hierarchical clustering was carried out for these selected
compounds against 15 genotypic categories (Fig. 1). We identified
six major groups of compounds that showed unique sensitivity
based on mutational genotype; these compound clusters were
sensitive to genetic mutations in the KRAS, STK11, MLH1,
CTNNB1 and BRAF genes, or sensitive to the TN-stromal
genotype. This result provides direct clues for understanding
common mechanisms of action (MOA) between diverse compounds
with similar applications.
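
The selection filters described above (potency, AUC and P value cutoffs) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the record layout, the compound names and all numbers in the toy input are made up, the cutoffs are treated as inclusive, and the AUC is taken on the 0-1 scale as quoted in this section (it would need rescaling if the 0-100 scale of the Methods is used).

```python
# Hypothetical record layout: per compound, a -logGI50 profile and,
# per genotype, an (AUC, P value) pair from the enrichment analysis.
POTENCY_CUTOFF = 5.0   # -logGI50 of 5, i.e. GI50 = 10 uM
AUC_CUTOFF = 0.85      # cutoff as quoted in the text (0-1 scale assumed)
P_CUTOFF = 0.01

def is_potent(neg_log_gi50_values, cutoff=POTENCY_CUTOFF):
    """True if the compound exceeds the cutoff in at least one cell line."""
    return any(v > cutoff for v in neg_log_gi50_values if v is not None)

def specific_genotypes(enrichment, auc_cutoff=AUC_CUTOFF, p_cutoff=P_CUTOFF):
    """Genotypes passing both cutoffs (inclusive bounds are an assumption)."""
    return sorted(g for g, (auc, p) in enrichment.items()
                  if auc >= auc_cutoff and p <= p_cutoff)

def select_compounds(compounds):
    """Map each potent, genotype-specific compound ID to its genotypes."""
    selected = {}
    for cid, rec in compounds.items():
        hits = specific_genotypes(rec["enrichment"])
        if hits and is_potent(rec["neg_log_gi50"]):
            selected[cid] = hits
    return selected

# Toy input with made-up numbers, for illustration only.
toy = {
    "cmpd_A": {"neg_log_gi50": [4.2, 6.1, 5.5],
               "enrichment": {"NRAS": (0.93, 0.004), "TP53": (0.51, 0.60)}},
    "cmpd_B": {"neg_log_gi50": [4.0, 4.3, 4.1],
               "enrichment": {"BRAF": (0.95, 0.001)}},
    "cmpd_C": {"neg_log_gi50": [5.8, 6.0, 5.9],
               "enrichment": {"KRAS": (0.70, 0.20)}},
}
selected = select_compounds(toy)
```

In the toy input, only cmpd_A passes: cmpd_B is genotype-specific but never potent, and cmpd_C is potent but not significantly enriched for any genotype.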

Genotype-dependent sensitivity of compounds 

From each of the six clusters shown in Fig. 1, we selected one
representative compound and further analyzed its genotype-specific
cellular response. Neuroblastoma RAS viral oncogene
homolog (NRAS) is a member of the Ras gene family, whose
members encode 21-kDa proteins belonging to the superfamily of
small GTP-binding proteins. NRAS has diverse intracellular
signaling functions that include control of cellular proliferation,
growth, and apoptosis (15). Somatic activating mutations in
RAS are present in up to 30% of all human cancers (16). We
found that NSC639187 (Landomycin A) belonged to the
NRAS-sensitive cluster (Fig. 2A). Three cell lines harboring
NRAS mutations demonstrated superior compound responses
(i.e., GI50) as compared to wild-type cell lines. Landomycin A,
a natural antibiotic, is known to inhibit DNA synthesis,
interfere with cellular processes critical for DNA synthesis,
and inhibit cell cycle progression from G1/S phase to
S phase (17), and we found that the cytotoxicity of this
compound was, on average, >10-fold higher in NRAS-mutant cell
lines.
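
Because responses are expressed as -logGI50, a fold difference in potency corresponds to a difference on the log scale: a gap of 1.0 between mutant and wild-type lines is a 10-fold difference in GI50. The sketch below illustrates this conversion. It assumes that "on average" means averaging the -logGI50 values (a geometric mean of GI50), which the text does not specify; the function names and example values are hypothetical.

```python
def gi50_micromolar(neg_log_gi50):
    """Convert -log10(GI50 in mol/L) to a GI50 in micromolar."""
    return 10.0 ** (6.0 - neg_log_gi50)

def fold_sensitivity(mutant_values, wildtype_values):
    """Fold difference in GI50 (wild type / mutant) from -logGI50 values.

    Averages are taken on the log scale, which is an assumption here.
    """
    mean_mut = sum(mutant_values) / len(mutant_values)
    mean_wt = sum(wildtype_values) / len(wildtype_values)
    return 10.0 ** (mean_mut - mean_wt)

# Made-up -logGI50 values for three mutant and three wild-type lines.
fold = fold_sensitivity([6.5, 6.0, 6.1], [5.0, 5.2, 5.1])
```

A value above 10 for `fold` corresponds to the ">10-fold" wording used for the representative compounds.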

STK11 (LKB1) encodes a serine-threonine kinase that directly
phosphorylates and activates AMPK, a central metabolic
sensor. AMPK regulates lipid, cholesterol and glucose
metabolism in specialized metabolic tissues, such as the liver,
muscle and adipose tissues (18). The STK11 protein is involved in
two biologically important pathways that lead to cancer. First,
STK11 helps to maintain a polarized epithelium, and second,
STK11 activates the AMP-dependent kinase (AMPK), which
controls the cellular energy balance (19). These insights into
STK11 function suggest that it may represent a target of novel
therapeutic strategies via its regulation of AMPK activity.
Furthermore, metformin, a widely prescribed oral hypoglycemic
for diabetes, is known to activate AMPK. In the present
study, NSC650914 (Phenoxan) (Fig. 2B) was identified from
the cluster associated with STK11 mutations, as shown in Fig. 1,
and five cell lines harboring STK11 mutations showed relatively
high sensitivity to Phenoxan. This compound is known
to affect the mitochondrial respiratory chain in human ovarian
carcinoma cell lines treated with tumor necrosis factor-alpha
(TNF-α) (20), and TNF-α is known to induce insulin resistance
through the AMPK pathway (21).

Table 1. Frequency of diverse mutations in the NCI60 cell lines. In this
study, 15 mutational genotypes with an occurrence in >3 cell lines (left
columns) were selected for CLEA analysis to associate mutations with
compound response (GI50). c-MYC-Amp, KRAS-Amp, AKT2-Amp and EGFR-Amp
represent gene amplifications.

Mutation type Number of cell lines  Frequency (%)  Mutation type Number of cell lines  Frequency (%)
TP53          44                    73.33          EGFR          3                     5.00
CDKN2A        35                    58.33          BRCA2         3                     5.00
PTEN          17                    28.33          NF2           3                     5.00
KRAS          14                    23.33          NOTCH1        2                     3.33
RAF           11                    18.33          HRAS          2                     3.33
PIK3CA        10                    16.67          MSH6          2                     3.33
APC           7                     11.67          VHL           2                     3.33
c-MYC-Amp     6                     10.00          FBXW7         1                     1.67
STK11         6                     10.00          FLT3          1                     1.67
CTNNB1        5                     8.33           PDGFRA        1                     1.67
SMAD4         5                     8.33           MAP2K4        1                     1.67
RB1           5                     8.33           KRAS-Amp      1                     1.67
MLH1          4                     6.67           AKT2-Amp      1                     1.67
NRAS          4                     6.67           BRCA1         1                     1.67
TN_stromal    4                     6.67           EGFR-Amp      1                     1.67

Fig. 1. Hierarchical clustering of the 1,161 compounds. The
significance level (P value) of the enrichment score (AUC value) was
used against the 15 genotypic categories for clustering. Six major
groups of compounds with unique genotype-specific cellular
responses are indicated by the corresponding mutated genes shown
on the left. Red represents sensitive responses (low GI50,
AUC > 50) to the genotype, while green represents resistant
responses (high GI50, AUC < 50).

Fig. 2. Chemical structure and genotype-specific cellular response
of various compounds. (A) An NRAS mutation-specific compound
and its cellular response. (B) An STK11 mutation-specific compound
and its cellular response. (C) An MLH1 mutation-specific compound
and its cellular response. The enrichment of the mutant cell lines
over the wild-type cell lines is displayed in a -logGI50 waterfall
plot. Compound specificity for the 15 mutational classes is
displayed in a bar graph.

MutL homolog 1 (MLH1) is a gene commonly associated with
hereditary nonpolyposis colorectal cancers (22). NSC741896
(4-(1-benzofuran-2-yl)-5H-1,2,3-dithiazole-5-thione) was defined
as having MLH1 mutation-dependent activity (Fig. 2C). Four cell
lines harboring MLH1 mutations showed >10-fold sensitivity to
this compound in comparison to MLH1 wild-type cell lines. The
study by Konstantinova et al. provided the first evidence
supporting the in vitro antiproliferative activity of 1,2,3-dithiazoles
on human breast cancer cell lines (23). Here, we found that the
MOA of this compound was more related to MLH1 genotype
than cell type (i.e., breast cancer cell line).

http://bmbreports.org

Mutation-specific compound response in cancers
Ningning He, et al.

CTNNB1 is a regulator of cell adhesion and a key downstream
effector in the Wnt signaling pathway. CTNNB1 has also
been implicated in tumorigenesis through the phosphorylation
and destabilization induced by CK1 and GSK-3beta
(24). We found that NSC731431 (Amarbellisine) possessed
CTNNB1-dependent sensitivity against the NCI60 cell lines
(Fig. 3A). Amarbellisine was previously reported as a strictly
growth inhibitory and antiproliferative molecule in a general
cancer drug discovery study (25), and its strong association
with CTNNB1 mutations should help to further the understanding
of this drug's MOA and anticancer applications.

Fig. 3. Chemical structure and genotype-specific cellular response
of various compounds. (A) A CTNNB1 mutation-specific compound
and its cellular response. (B) A BRAF mutation-specific compound
and its cellular response. (C) A TN stromal-specific compound and
its cellular response. The enrichment of the mutant cell lines over
wild-type cell lines is displayed in a -logGI50 waterfall plot.
Compound specificity for the 15 mutational classes is displayed in
a bar graph.

BRAF is an oncogene that encodes the B-Raf protein, which 
is involved in intracellular signaling and cell growth. BRAF was 
shown to be frequently mutated in human cancers (26), and the 
V600E mutation of the BRAF gene has been associated with 
hairy cell leukemia in numerous studies (27). BRAF mutations 
yielded the most statistically significant associations with
compound activity. NSC46061 (butanedioic acid compound with
10-[3-(4-methyl-1-piperazinyl)propyl]-2-(trifluoromethyl)-10H-phenothiazine
(2:1)) was selected as a representative of the
BRAF-dependent compound cluster (Fig. 3B), as it has been re- 
ported to represent a signature compound associated with mu- 
tations in the RAS-BRAF pathway (28). Additional compounds 
in the same cluster provide a promising resource for the devel- 
opment of new BRAF-specific cancer therapies. 

Tenascin (TN) was previously shown to be highly expressed
in cancer cells (29), and we identified NSC127716
(2'-deoxy-5-azacytidine; Decitabine, Dacogen) as displaying a
TN-stromal-dependent cellular response (Fig. 3C). Decitabine
reactivates unmethylated p21 and, in some cases, this effect is
independent of wild-type p53. Decitabine was also shown to
restore the expression of Apaf1 in primary AML cells and
increase the susceptibility of bladder TCC to cisplatin. Other
reports have also demonstrated that this compound potentiates
p53 inducibility of NOXA, activates reexpression of p73 in
AML cells and mediates cell cycle arrest in the G2/M phase via
the p38 MAP kinase pathway (30, 31). Thus, we believe that
TN-stromal can be used as a unique marker to predict the
sensitivity of cancer cells to decitabine.

NCI60 chemical screening data have been used to identify
new anticancer agents and understand the MOA of anti-cancer 
compounds (32). Although many studies have demonstrated 
correlations between chemical structure and MOA or cancer 
lineage, the importance of cancer genotype in compound re- 
sponses has not been appropriately addressed. By taking ad- 
vantage of publicly available genotype data regarding the 
NCI60 cell lines and CLEA technology, we identified a subset 
of compounds exhibiting significant associations with cancer 
genotype in terms of their cellular response. In the CLEA
analysis, hierarchical clustering of selected potent compounds
revealed a total of six clusters of compounds that were sensitive
to mutations in the KRAS, STK11, MLH1, CTNNB1, and BRAF
genes and the TN-stromal genotype. Many non-NCI60 compounds
in previous studies were also retrospectively validated for their 
genotype-specific activity (33, 34). In the present study, we 
identified a subset of compounds with specific activity against 
the major cancer genotypes present in the NCI60 cell line 
collection. These results provide a unique resource for opti- 
mizing anticancer therapies and new drug discovery. 
MATERIALS AND METHODS

NCI60 response database

The NCI60 response database contains more than 40,000
compounds with negative log-transformed GI50 values (-logGI50),
which can be used to characterize sensitivity across the 60
cancer cell lines. Briefly, the cell lines were grown in 96-well
plates and exposed to the test compound for 48 hours. Growth
inhibition was expressed in terms of the GI50, i.e., the
concentration required to inhibit cell growth by 50% in comparison to
that in untreated controls (3). We filtered out 9,075 compounds
with missing data (-logGI50 values available for fewer than 45
cell lines) or those that possessed a minimal level of variance
across the 60 cell lines (a standard deviation less than 0.1
across the available lines). The -logGI50 values of the
remaining 34,921 compounds were used for further analysis.
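
The two filters described above (coverage and variance) can be sketched as follows. This is a minimal illustration under assumptions: missing measurements are represented as None, the sample standard deviation is used (the text does not say whether sample or population), and the function names are hypothetical.

```python
import statistics

MIN_MEASURED = 45   # at least 45 of the 60 cell lines must have a value
MIN_SD = 0.1        # minimum standard deviation across measured lines

def keep_compound(profile, min_measured=MIN_MEASURED, min_sd=MIN_SD):
    """Decide whether a -logGI50 profile passes both filters.

    profile: list of -logGI50 values, None where no measurement exists.
    """
    measured = [v for v in profile if v is not None]
    if len(measured) < min_measured:
        return False
    # Sample standard deviation; an assumption, see lead-in.
    return statistics.stdev(measured) >= min_sd

def filter_profiles(profiles):
    """Return only the compounds whose profiles pass both filters."""
    return {cid: p for cid, p in profiles.items() if keep_compound(p)}
```

A flat profile (identical values in every cell line) is dropped by the variance filter, and a profile measured in only 44 cell lines is dropped by the coverage filter.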

Statistical analysis 

Cell Line Enrichment Analysis (CLEA) was designed as a
statistical analysis method to associate experimental data
(compound response, gene/protein expression and protein
phosphorylation) with cancer genotypes defined by gene mutations. We
previously reported the use of CLEA for associating chemical
activity with other cellular parameters (2). Briefly, the
prioritization of cell lines of particular genotypes for a specific
compound was analyzed on a Receiver Operating Characteristic
(ROC) curve plot. The Area Under the Curve (AUC) of the ROC
plot was used as a measure of "sensitivity" or "resistance". The
AUC score will be around 50 for random enrichment and near 100
for perfect enrichment. The statistical significance (P value) for
the AUC score was assigned through 1,000 permutation tests.
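
The enrichment score and its permutation test can be sketched as follows. This is an illustration of the general technique (ROC AUC with label permutation), not the published CLEA code. The AUC is computed through its pairwise-comparison form, i.e. the fraction of mutant/wild-type pairs in which the mutant line is more sensitive, with ties counted as one half. The one-sided test, the fixed seed and all names are assumptions.

```python
import random

def enrichment_auc(scores, is_mutant):
    """ROC AUC on a 0-100 scale for ranking mutant above wild-type lines.

    scores: -logGI50 per cell line (higher means more sensitive).
    is_mutant: parallel list of booleans.
    """
    mutant = [s for s, m in zip(scores, is_mutant) if m]
    wild = [s for s, m in zip(scores, is_mutant) if not m]
    if not mutant or not wild:
        raise ValueError("need at least one mutant and one wild-type line")
    wins = 0.0
    for m in mutant:
        for w in wild:
            if m > w:
                wins += 1.0
            elif m == w:
                wins += 0.5   # ties count as half a win
    return 100.0 * wins / (len(mutant) * len(wild))

def permutation_p_value(scores, is_mutant, n_perm=1000, seed=0):
    """Observed AUC and one-sided P value from shuffled genotype labels."""
    rng = random.Random(seed)
    observed = enrichment_auc(scores, is_mutant)
    labels = list(is_mutant)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if enrichment_auc(scores, labels) >= observed:
            hits += 1
    return observed, hits / n_perm

# Made-up profile: the three mutant lines are the three most sensitive.
scores = [7.0, 6.8, 6.5, 5.0, 4.9, 4.8, 4.7, 4.6, 4.5, 4.4]
labels = [True, True, True] + [False] * 7
auc, p_value = permutation_p_value(scores, labels)
```

In this toy profile the AUC reaches 100 (perfect enrichment); a profile in which mutant and wild-type lines are indistinguishable would score around 50.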

Software support 

The 2D structures and annotations of the NCI60 compounds
were displayed using MarvinSketch developed by ChemAxon
(http://www.chemaxon.com/). Hierarchical clustering was
carried out using Cluster 3.0 (developed by the Human Genome
Center, June 2002 (35)). Tree Viewer (developed by Eisen's
laboratory (36)) was used to visualize the clustered data.